public cloud
CloudFormer: An Attention-based Performance Prediction for Public Clouds with Unknown Workload
Shahbazinia, Amirhossein, Huang, Darong, Costero, Luis, Atienza, David
Cloud platforms are increasingly relied upon to host diverse, resource-intensive workloads due to their scalability, flexibility, and cost-efficiency. In multi-tenant cloud environments, virtual machines are consolidated on shared physical servers to improve resource utilization. While virtualization guarantees resource partitioning for CPU, memory, and storage, it cannot ensure performance isolation. Competition for shared resources such as last-level cache, memory bandwidth, and network interfaces often leads to severe performance degradation. Existing management techniques, including VM scheduling and resource provisioning, require accurate performance prediction to mitigate interference. However, this remains challenging in public clouds due to the black-box nature of VMs and the highly dynamic nature of workloads. To address these limitations, we propose CloudFormer, a dual-branch Transformer-based model designed to predict VM performance degradation in black-box environments. CloudFormer jointly models temporal dynamics and system-level interactions, leveraging 206 system metrics at one-second resolution across both static and dynamic scenarios. This design enables the model to capture transient interference effects and adapt to varying workload conditions without scenario-specific tuning. Complementing the methodology, we provide a fine-grained dataset that significantly expands the temporal resolution and metric diversity compared to existing benchmarks. Experimental results demonstrate that CloudFormer consistently outperforms state-of-the-art baselines across multiple evaluation metrics, achieving robust generalization across diverse and previously unseen workloads. Notably, CloudFormer attains a mean absolute error (MAE) of just 7.8%, representing a substantial improvement in predictive accuracy and outperforming existing methods by at least 28%.
- North America > United States > New York > New York County > New York City (0.05)
- Europe > Spain > Community of Madrid > Madrid (0.04)
- North America > United States > California > Santa Clara County > Mountain View (0.04)
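CloudFormer's dual-branch architecture is not spelled out in the abstract, so as a rough, hypothetical illustration of the attention primitive such a model builds on, here is a single scaled dot-product self-attention pass over a window of per-second system-metric vectors in NumPy. The window length, embedding size, and linear projection are invented for this sketch; only the 206-metric count comes from the abstract.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Standard attention: softmax(QK^T / sqrt(d)) V."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    # Numerically stable row-wise softmax.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
window, n_metrics, d_model = 8, 206, 16  # 8-second window; 206 metrics per the abstract

# Hypothetical linear projection embedding each one-second metric sample.
w_embed = rng.normal(size=(n_metrics, d_model))
x = rng.normal(size=(window, n_metrics))  # one window of per-second samples
h = x @ w_embed                           # (window, d_model) token embeddings

out, attn = scaled_dot_product_attention(h, h, h)
# Each row of attn is a distribution over the 8 time steps, letting the
# model weight transient interference events within the window.
```

A real temporal branch would stack several such layers with learned query/key/value projections and positional encodings; this sketch only shows why per-second resolution matters, i.e. attention can weight individual seconds within the window.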
AI-Driven Resource Allocation Framework for Microservices in Hybrid Cloud Platforms
Barua, Biman, Kaiser, M. Shamim
The increasing demand for scalable, efficient resource management in hybrid cloud environments has led to the exploration of AI-driven approaches for dynamic resource allocation. This paper presents an AI-driven framework for resource allocation among microservices in hybrid cloud platforms. The framework employs reinforcement learning (RL)-based resource utilization optimization to reduce costs and improve performance. The framework integrates AI models with cloud management tools to respond to challenges of dynamic scaling and cost-efficient low-latency service delivery. The reinforcement learning model continuously adjusts provisioned resources as required by the microservices and predicts future consumption trends to minimize both under- and over-provisioning of resources. Preliminary simulation results indicate that AI-driven resource provisioning can reduce expenditure by up to 30-40% compared to manual provisioning and threshold-based auto-scaling approaches. Resource-utilization efficiency is estimated to improve by 20-30%, with a corresponding 15-20% latency reduction during peak demand periods. This study compares the AI-driven approach with existing static and rule-based resource allocation methods, demonstrating that the new model outperforms them in flexibility and real-time responsiveness. The results indicate that reinforcement learning can further optimize hybrid cloud platforms, offering a 25-35% improvement in cost efficiency and better scalability for microservice-based applications. The proposed framework is a strong and scalable solution to managing cloud resources in dynamic and performance-critical environments.
- Asia > Singapore (0.05)
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
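The paper does not publish its RL formulation, so the following is only a toy sketch of the underlying idea: tabular Q-learning over a made-up replica-count MDP whose reward penalizes both under- and over-provisioning. The states, actions, reward, and hyperparameters are all invented for illustration.

```python
import random

# Hypothetical toy MDP: state = current replica count (1..5);
# action = -1 (scale down), 0 (hold), +1 (scale up).
ACTIONS = (-1, 0, 1)
TARGET = 3          # pretend 3 replicas exactly matches demand
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1

q = {(s, a): 0.0 for s in range(1, 6) for a in ACTIONS}

def step(state, action):
    nxt = min(5, max(1, state + action))
    # Reward penalizes both under- and over-provisioning.
    return nxt, -abs(nxt - TARGET)

random.seed(0)
state = 1
for _ in range(2000):
    # Epsilon-greedy action selection.
    if random.random() < EPS:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: q[(state, a)])
    nxt, reward = step(state, action)
    best_next = max(q[(nxt, a)] for a in ACTIONS)
    # Standard Q-learning update.
    q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
    state = nxt

# Greedy policy: at the demand-matching replica count it should hold.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(1, 6)}
```

A production agent would of course observe richer state (utilization, latency, predicted demand) and use cost-aware rewards, but the under-/over-provisioning trade-off in the loss is the same shape as the one the abstract describes.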
Alioth: A Machine Learning Based Interference-Aware Performance Monitor for Multi-Tenancy Applications in Public Cloud
Shi, Tianyao, Yang, Yingxuan, Cheng, Yunlong, Gao, Xiaofeng, Fang, Zhen, Yang, Yongqiang
Multi-tenancy in public clouds may lead to co-location interference on shared resources, which possibly results in performance degradation of cloud applications. Cloud providers want to know when such events happen and how serious the degradation is, to perform interference-aware migrations and alleviate the problem. However, virtual machines (VMs) in Infrastructure-as-a-Service public clouds are black-boxes to providers, where application-level performance information cannot be acquired. This makes performance monitoring intensely challenging as cloud providers can only rely on low-level metrics such as CPU usage and hardware counters. We propose a novel machine learning framework, Alioth, to monitor the performance degradation of cloud applications. To feed the data-hungry models, we first design interference generators and conduct comprehensive co-location experiments on a testbed to build Alioth-dataset which reflects the complexity and dynamicity in real-world scenarios. Then we construct Alioth by (1) augmenting features via recovering low-level metrics under no interference using denoising auto-encoders, (2) devising a transfer learning model based on domain adaptation neural network to make models generalize on test cases unseen in offline training, and (3) developing a SHAP explainer to automate feature selection and enhance model interpretability. Experiments show that Alioth achieves an average mean absolute error of 5.29% offline and 10.8% when testing on applications unseen in the training stage, outperforming the baseline methods. Alioth is also robust in signaling quality-of-service violations under dynamicity. Finally, we demonstrate a possible application of Alioth's interpretability, providing insights to benefit the decision-making of cloud operators. The dataset and code of Alioth have been released on GitHub.
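Alioth's first step, recovering interference-free metrics with a denoising auto-encoder, can be illustrated with a deliberately minimal sketch: a one-hidden-layer linear auto-encoder in NumPy trained to map "interfered" metrics back to their "clean" values on synthetic data. The sizes, noise level, and training scheme here are invented; the abstract does not specify Alioth's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins: "clean" = low-level metrics under no interference,
# "noisy" = the same metrics observed under co-location interference.
n, d, h, k = 512, 12, 6, 4
latent = rng.normal(size=(n, k))
mix = rng.normal(scale=0.5, size=(k, d))
clean = latent @ mix
noisy = clean + 0.3 * rng.normal(size=(n, d))

# One-hidden-layer linear denoising auto-encoder trained with full-batch
# gradient descent (all sizes illustrative, not Alioth's).
w1 = rng.normal(scale=0.1, size=(d, h))   # encoder
w2 = rng.normal(scale=0.1, size=(h, d))   # decoder

def recon_mse(w1, w2):
    return float(np.mean((noisy @ w1 @ w2 - clean) ** 2))

mse_start = recon_mse(w1, w2)
lr = 0.01
for _ in range(500):
    z = noisy @ w1                 # encode interfered metrics
    err = z @ w2 - clean           # residual vs. interference-free metrics
    g2 = (z.T @ err) / n           # gradient of MSE w.r.t. decoder
    g1 = (noisy.T @ (err @ w2.T)) / n  # gradient of MSE w.r.t. encoder
    w1 -= lr * g1
    w2 -= lr * g2
mse_end = recon_mse(w1, w2)

# The reconstructed "no-interference" metrics augment the raw features,
# mirroring Alioth's feature-augmentation step.
features = np.hstack([noisy, noisy @ w1 @ w2])
```

In the real system the auto-encoder is nonlinear and trained on testbed traces; the point of the sketch is only the pipeline shape, namely reconstruct the counterfactual clean metrics, then feed both raw and reconstructed metrics to the downstream degradation predictor.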
The cost and sustainability of generative AI
AI is resource intensive for any platform, including public clouds. Most AI technology requires numerous inference calculations that add up to higher processor, network, and storage requirements--and higher power bills, infrastructure costs, and carbon footprints. The rise of generative AI systems, such as ChatGPT, has brought this issue to the forefront again. Given the popularity of this technology and the likely massive expansion of its use by companies, governments, and the public, we could see the power consumption growth curve take on a concerning arc. AI has been viable since the 1970s but did not have much business impact initially, given the amount of resources needed for a full-blown AI system to work.
Stop your public-cloud AI projects from dripping you dry
Last year, Andreessen Horowitz published a provocative blog post entitled "The Cost of Cloud, a Trillion Dollar Paradox." In it, the venture capital firm argued that out-of-control cloud spending is resulting in public companies leaving billions of dollars in potential market capitalization on the table. An alternative, the firm suggests, is to recalibrate cloud resources into a hybrid model. Such a model can boost a company's bottom line and free capital to focus on new products and growth.
- Banking & Finance (0.61)
- Government (0.41)
Hybrid AI Inferencing managed with Microsoft Azure Arc-Enabled Kubernetes
Cloud native deployment with Kubernetes orchestration has enabled the "Write Once, Deploy Anywhere" paradigm for applications. This application development and deployment model enables scale and agility in today's hybrid and multi-cloud environments. Applications or services packaged as containers can be deployed and managed with the same Kubernetes-based ecosystem tools in the public cloud, on premise or Edge locations. Microsoft Azure Arc-Enabled Kubernetes (Reference 1) could be viewed as one such ecosystem tool that enables central management of Kubernetes clusters deployed at on-premises locations or across different public clouds. Kubernetes-based offerings from different vendors are supported, and they need not be based on Azure Kubernetes Service (AKS) (Reference 2).
What is Cloud Analytics and Its Importance? - Analytics Vidhya
This article was published as a part of the Data Science Blogathon. The delivery of computing services over the internet is known as cloud computing. Businesses can adopt the cloud computing paradigm, renting IT equipment and services instead of purchasing and operating their own data centers. These services cover everything from basic infrastructure like networking, servers, storage, databases, and software to cutting-edge tools like artificial intelligence (AI) and machine learning systems. As less equipment needs to be purchased and maintained, cloud analytics helps businesses cut expenses while boosting productivity.
- Information Technology > Services (1.00)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Cloud Computing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
The Technology Behind Sam's Club, Walmart's Membership Warehouse Store
While Sam's Club at first glance may seem like a typical warehouse retail membership club, look beyond the pallets of packaged food and tabletops stacked with designer clothing, and you'll see an operation committed to using technology to improve the experience of customers – or members, as they are called – and its own operations. Case in point -- the company introduced the Scan and Go mobile phone app in 2016, allowing customers to avoid the check-out lines by using their mobile phones to scan barcodes themselves and then click a button to check out and pay. The service made shopping more convenient for the members who used it in 2016. But it really stood out as visionary when the pandemic hit in 2020. Scan and Go provided a contactless shopping experience at a time when the guidance on COVID was to "social distance" by staying 6 feet away from anyone else.
- Retail (1.00)
- Information Technology > Services (0.50)
- Information Technology > Cloud Computing (1.00)
- Information Technology > Artificial Intelligence (1.00)
- Information Technology > Communications > Mobile (0.75)
The real cost of cloud computing - VentureBeat - UrIoTNews
The public cloud is growing rapidly and the market for the technology is expected to reach $1.3 trillion by 2025. The cloud has revolutionized the computing industry and enabled many applications, business models and enterprises, which otherwise wouldn't have been possible. Immediate availability, scalability, minimal capital expenditure and streamlined developer experience are its main advantages -- but it comes at a cost. Due to a lack of in-house infrastructure optimization capabilities, most enterprises stick to the cloud even after achieving a certain maturity. To keep cloud spending under control, enterprises have built or acquired tools and services.
- Information Technology > Services (0.52)
- Information Technology > Security & Privacy (0.51)
- Information Technology > Cloud Computing (1.00)
- Information Technology > Artificial Intelligence > Speech (0.36)
Eight Storage Requirements for Artificial Intelligence and Deep Learning
Regardless of where data resides, integration with the public cloud will still be an important requirement for two reasons. First, much of the AI/DL innovation is occurring in the cloud. On-prem systems that are cloud-integrated will provide the greatest flexibility to leverage cloud-native tools. Second, we are likely to see a fluid flow of data to/from the cloud as information is generated and analyzed. An on-prem solution should simplify that flow, not limit it.